182 research outputs found

True Detective: A Deep Abductive Reasoning Benchmark Undoable for GPT-3 and Challenging for GPT-4

Authors: Del Maksym; Fishel Mark
Publication date: 01/06/2023

Large language models (LLMs) have demonstrated solid zero-shot reasoning capabilities, which is reflected in their performance on the current test tasks. This calls for a more challenging benchmark requiring highly advanced reasoning ability to be solved. In this paper, we introduce such a benchmark, consisting of 191 long-form (1200 words on average) mystery narratives constructed as detective puzzles. Puzzles are sourced from the "5 Minute Mystery" platform and include a multiple-choice question for evaluation. Only 47% of humans solve a puzzle successfully on average, while the best human solvers achieve over 80% success rate. We show that GPT-3 models barely outperform random on this benchmark (with 28% accuracy) while state-of-the-art GPT-4 solves only 38% of puzzles. This indicates that there is still a significant gap in the deep reasoning abilities of LLMs and humans and highlights the need for further research in this area. Our work introduces a challenging benchmark for future studies on reasoning in language models and contributes to a better understanding of the limits of LLMs' abilities.

Comment: 5 pages, to appear at *SE

arXiv.org e-Print Archive

Multi-Domain Neural Machine Translation

Authors: Fishel Mark; Tars Sander
Publication date: 01/01/2018

We present an approach to neural machine translation (NMT) that supports multiple domains in a single model and allows switching between the domains when translating. The core idea is to treat text domains as distinct languages and use multilingual NMT methods to create multi-domain translation systems. We show that this approach results in significant translation quality gains over fine-tuning. We also explore whether knowledge of pre-specified text domains is necessary. It turns out that it is, but also that quite high translation quality can be reached even when the domain is not known.

Comment: Accepted to EAMT'2018. In Proceedings of the 21st Annual Conference of the European Association for Machine Translation (EAMT'2018)

arXiv.org e-Print Archive

Repositorio Institucional de la Universidad de Alicante

Distilling Estonian Text Domains for Production-Oriented Machine Translation

Authors: Fishel Mark; Korotkova Elizaveta
Publication venue: University of Tartu Library
Publication date: 01/05/2023

DSpace at Tartu University Library

Voting and Stacking in Data-Driven Dependency Parsing

Authors: Fishel Mark; Nivre Joakim
Publication date: 13/05/2009

Proceedings of the 17th Nordic Conference of Computational Linguistics NODALIDA 2009. Editors: Kristiina Jokinen and Eckhard Bick. NEALT Proceedings Series, Vol. 4 (2009), 219-222. © 2009 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/9206

DSpace at Tartu University Library

DNA Repair Proteins as Molecular Targets for Cancer Therapeutics

Authors: Fishel Melissa L.; Kelley Mark R.
Publication date: 01/05/2008

Cancer therapeutics include an ever-increasing array of tools at the disposal of clinicians in their treatment of this disease. However, cancer is a tough opponent in this battle, and current treatments, which typically include radiotherapy, chemotherapy and surgery, are often not enough to rid the patient of his or her cancer. Cancer cells can become resistant to the treatments directed at them, and overcoming this drug resistance is an important research focus. Additionally, increasing discussion and research is centering on targeted and individualized therapy. While a number of approaches have undergone intensive and close scrutiny as potential ways to treat and kill cancer (signaling pathways, multidrug resistance, cell cycle checkpoints, anti-angiogenesis, etc.), much less work has focused on blocking the ability of a cancer cell to recognize and repair the damaged DNA which primarily results from the front-line cancer treatments: chemotherapy and radiation. More recent studies on a number of DNA repair targets have produced proof-of-concept results showing that selective targeting of these DNA repair enzymes has the potential to enhance and augment the currently used chemotherapeutic agents and radiation, as well as to overcome drug resistance. Some of the targets identified result in the development of effective single-agent anti-tumor molecules. While it is inherently convoluted to think that inhibiting DNA repair processes would be a likely approach to kill cancer cells, careful identification of specific DNA repair proteins is increasingly appearing to be a viable approach in the cancer therapeutic cache.

IUPUIScholarWorks

Mixing and blending syntactic and semantic dependencies

Authors: Eklund Johan; Fishel Mark; Saers Markus; Samuelsson Yvonne; Täckström Oscar; Velupillai Sumithra
Publication venue: Coling 2008 Organizing Committee
Publication date: 01/01/2008

Our system for the CoNLL 2008 shared task uses a set of individual parsers, a set of stand-alone semantic role labellers, and a joint system for parsing and semantic role labelling, all blended together. The system achieved a macro-averaged labelled F1-score of 79.79 (WSJ 80.92, Brown 70.49) for the overall task. The labelled attachment score for syntactic dependencies was 86.63 (WSJ 87.36, Brown 80.77) and the labelled F1-score for semantic dependencies was 72.94 (WSJ 74.47, Brown 60.18).

CiteSeerX

Crossref

Publikationer från Uppsala Universitet

RISE – Research Institutes of Sweden

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Swedish Institute of Computer Science Publications Database

Software institutes' Online Digital Archive

Findings of the 2019 Conference on Machine Translation (WMT19)

Authors: Barrault Loïc; Bojar Ondřej; Costa-Jussà Marta R.; Federmann Christian; Fishel Mark; Graham Yvette
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 01/08/2019

This paper presents the results of the premier shared task organized alongside the Conference on Machine Translation (WMT) 2019. Participants were asked to build machine translation systems for any of 18 language pairs, to be evaluated on a test set of news stories. The main metric for this task is human judgment of translation quality. The task was also opened up to additional test suites to probe specific aspects of translation.

Irish Universities

DCU Online Research Access Service